Network analysis of long non-coding RNA expression profiles in common warts

Background Long non-coding RNAs (lncRNAs) have been the subject of considerable attention in recent years due to their role in gene regulation. However, the function of lncRNAs remains poorly understood, especially in the context of infection with low-risk human papillomaviruses (HPVs). To further understanding on this issue, we investigated lncRNA expression in HPV-induced common warts. Methods A publicly available high-throughput sequencing dataset for common warts was downloaded from the Gene Expression Omnibus (GEO). lncRNA profiles were generated using the NetworkAnalyst 3.0 workflow, and a list of differentially expressed (DE) lncRNAs in common warts was identified and inputted into the ENCODE, RegNetwork, DisGeNet, and miRNet platforms. Results A total of 54 lncRNAs were revealed to be significantly dysregulated in common warts. Of these 54 lncRNAs, 24 and 30 were upregulated and downregulated, respectively. The most significantly differentially expressed lncRNAs in common warts included the CERNA2, LINC02159, SH3PXD2A-AS1, and UNC5B-AS1 genes. Conclusion The current findings suggest that HPV-induced warts impact the host lncRNA transcriptome. To the best of our knowledge, the present study is the first to explore the impact of low-risk HPV infection on lncRNA expression profiles.


Introduction
Long non-coding RNAs (lncRNAs) are a heterogeneous group of non-protein-coding RNA transcripts that are greater than 200 nucleotides in length and which function in many biological processes as well as diseases [1]. The function of the majority of identified lncRNAs in humans remains to be elucidated, but it is clear that lncRNAs are involved in the regulation of gene expression [2,3]. lncRNA dysregulation has been associated with various cancers in humans and can be induced in response to infection [4]. Viral infection can also influence the host genome to express lncRNAs that are essential for viral pathogenesis [5,6]. For example, human papillomavirus (HPV) infection has been reported to widely alter the lncRNA transcriptome of its host [7].
The HPVs are a family of non-enveloped DNA viruses with a preferential tropism for epithelial tissue [8]. More specifically, HPVs infect keratinocytes, the most common epidermal cells, and their replication cycle is tightly linked to keratinocyte differentiation [9]. With over 200 types identified, HPVs are divided into high-risk and low-risk groups based on their oncogenic potential [10]. In immunocompetent individuals, low-risk HPV infection is self-limiting and primarily manifests in the form of benign hyper-proliferative epithelial lesions known as warts [10]. Wart formation typically entails the enlargement of the epithelial cells, thickening of the epithelium, folding of the basal epithelial layer, and cornification of the outer epithelial layer [11]. The most prevalent type of cutaneous wart is the common wart (Verruca vulgaris), which accounts for 70% of warts and is preferentially located on the hands [12].
Despite affecting nearly 1 in 10 people, warts have not been the subject of much genetic research, and current treatments mostly rely on damaging or destroying the infected epithelia [13,14]. In addition, little is known about how lncRNA expression profiles are impacted by low-risk HPV infection. Therefore, the aim of this study was to explore how HPV-induced common warts modulated host expression of lncRNAs.

Data acquisition
The dataset investigated in the present study was obtained from The National Center for Biotechnology Information's (NCBI) Gene Expression Omnibus (GEO) repository (accession number GSE136347). The data contained paired epidermal tissue samples from warts (n ¼ 12) and normal skin (n ¼ 12) that were profiled on the Illumina HiSeq 2500 platform. According to the original study, total RNA was extracted from the samples using an RNeasy Mini Kit (Qiagen, Germany), and the purity of the samples was assessed using the Agilent Bioanalyzer (Agilent, USA) [15]. All the samples analyzed in the present study were obtained from Jordanian Arab males.

Data processing
Quality control, filtering, normalization, differential expression analysis, and data visualization were carried out according to the Net-workAnalyst 3.0 workflow [16,17,18,19,20]. Quality control was performed to check the distribution of raw read counts, the sum of read counts for each sample, similarities and differences between the samples, and the relative distribution of different counts in each group.
Both variance and abundance filtering were applied to the dataset. The 15 th percentile of genes with the lowest variance (whose expression values did not change across the samples) and the 4 th percentile of genes with the lowest abundance were removed. Based on this filtering step, 37 genes with low variance and 0 genes with low abundance were excluded from downstream analysis.
The data was uploaded to the Network Analyst 3.0 web platform and normalized via log2 transformation. The normalized data was displayed in a principal component analysis (PCA) plot and visually inspected for outliers ( Figure 1). Differential expression was identified and quantified using EdgeR, where a lncRNA's expression was deemed significant if it had an adjusted p-value of <0.05, resulting in 229 differentially expressed (DE) lncRNAs. Upon applying a cutoff of log2 fold change of >1 (equal to a fold change of 2 on the original scale), a total of 54 lncRNAs were identified as significant.

Integrative pathway analysis
Gene ontology (GO) analysis was performed as part of the Networ-kAnalyst 3.0 workflow to determine which biological processes, cellular components, and molecular functions were enriched in common warts.

Interaction network analysis
2.4.1. Using data from integrated knowledge bases As part of the NetworkAnalyst 3.0 workflow, the list of DE lncRNAs were inputted into the ENCODE, RegNetwork, DisGeNet, and miRNet platforms. ENCODE was used to obtain information about interactions between transcription factors and gene targets, while coregulatory interaction data was obtained from RegNetwork. The DisGeNet database was used to determine lncRNA-disease association data, and miRNet was used to predict the interactions between DE lncRNAs and miRNA targets [21,22,23].

Using data from previously published studies
The interaction between DE lncRNA, DE miRNA and DE genes in common warts was carried out using miRNet 2.0, a network-based visual analytic tool for miRNA functional analysis [24]. A total of 54 DE lncRNA, 6 DE miRNA [25], and 3,140 DE genes [15] were uploaded to miRNet 2.0 and submitted for analysis.

Validation of the differentially expressed lncRNA genes
A list of the most differentially methylated (DM) genes [26] in common warts from a previous study was merged with the significantly DE lncRNA genes. Overlapping genes, i.e., those that had significant levels of differential expression and methylation, were considered to be validated. Table 1 shows the 54 lncRNAs that were significantly DE (logFC>1 and adj. p-value<0.05) in common warts. The average GeneCards   Inferred Functionality Score (GIFtS) for these lncRNA genes was 15 out of 100, indicating a low level of knowledge about their functionality. Figure 2 depicts a volcano plot of the 229 significantly DE lncRNAs in common warts, which have an adjusted p-value of less than 0.05. When adding a significance cutoff of logFC < -1 or >1, 54 genes are found to be significantly DE, which are depicted in a clustered heatmap (Figure 3).

Identification of differentially expressed genes
Lastly, Figure 4 shows the distribution of DE lncRNAs on each chromosome.

Pathway and ontology analysis
No KEGG and Reactome pathways were found to be associated with the 229 significantly DE lncRNAs in common warts. Similarly, gene    ontology (GO) term enrichment analysis did not yield any significant GO terms. Figure 5 shows the analysis of the interaction network between transcription factors and lncRNAs in common warts.

Disease-lncRNA associations
3 diseases have been found to be associated with HOTAIR, while MIR4435-2HG is associated with lung neoplasms. Figure 6 displays the diseases associated with the HOTAIR lncRNA gene.

lncRNA-miRNA interactions
Out of 54 DE lncRNA, 28 DE lncRNAs were mapped to 523 miRNAs. The interaction network included 551 nodes and 1,263 edges. Due to the large number of nodes and edges, only the 28 most DE lncRNAs with the common shared miRNAs are presented in Figures 7A and 7B. Figure 8 shows the top 10 most DE lncRNAs with at least 49 connections with miRNAs on the original interaction network.

lncRNA-miRNA interactions
The result of the analysis revealed a total of 7,137 nodes, of which 4,423 are genes, 2,686 are miRNAs, and 28 are lncRNAs. To simplify the network visualization, the 28 lncRNAs were extracted with their common shared miRNA (104 nodes) and genes (24 nodes) ( Figures 9A and 9B).

Validation of differentially expressed lncRNAs
Merger of the most DE and DM lncRNAs in common warts revealed 21 overlapping lncRNA genes ( Table 2).

Discussion
Previous evidence indicated that lncRNAs are deregulated by the high-risk HPV proteins E6 and E7 [7]. However, the role and impact of E6 and E7 proteins from low-risk HPV types remain poorly understood compared to their high-risk counterparts [11]. Here, we found that 54 lncRNAs were DE in common warts, 24 of which were upregulated and 30 of which were downregulated. The most significantly DE lncRNAs in common warts were CERNA2, LINC02159, SH3PXD2A-AS1, and UNC5B-AS1. In common warts, the most downregulated lncRNA was UNC5B-AS1 and the most upregulated lncRNA was MIR100HG.
CERNA2, also known as HOST2, was the most significantly DE lncRNA in common warts and was downregulated. CERNA2 acts as an oncogenic lncRNA, and silencing of CERNA2 expression was found to inhibit the proliferative, migratory, and invasive capacities of breast,  cervical, gastric, and osteosarcoma cancer cells [27,28,29,30]. It was also suggested to increase epithelial-mesenchymal transition in hepatocellular carcinoma tissues via the JAK2-STAT3 signaling pathway [31]. Compared to normal skin, CERNA2 was significantly upregulated in psoriatic skin and positively correlated with STAT1, a gene whose expression is inhibited by HPV proteins for episome maintenance and genome amplification [32,33]. Additionally, CERNA2 was up-regulated in HPV-positive cervical cancer cells, resulting in the down-regulation of its target microRNA let-7b [34]. LINC02159, also known as LOC285629, was DE in thyroid cancer, HER2 þ HR À breast cancer, and metastatic melanoma, and it was identified as a part of lncRNA signatures in colorectal cancer as well as head and neck squamous cell carcinoma [35,36,37,38,39,40]. LINC02159 was classified as a low-risk gene for esophageal squamous cell  carcinoma, and it is significantly associated with colorectal cancer prognosis [41]. Evidence suggests that SH3PXD2A-AS1 acts as an oncogene in colorectal cancer and esophageal squamous cell carcinoma, with SH3PXD2A-AS1 knockdown resulting in the inhibition of colorectal tumor growth and migration [42,43,44,45]. SH3PXD2A-AS1 was markedly upregulated in psoriatic skin, and it was identified as part of a positive feedback loop that regulated the proliferation of keratinocytes and affected the pathogenesis of psoriasis [32,46].
Our findings showed that MALAT1 had the highest number of interactions with microRNAs and protein-coding genes that were previously found to be DE in common warts. Upregulated MALAT1 expression was identified in high-risk HPV-positive cervical squamous tissues and cell lines, and MALAT1 was suggested to modulate cellular apoptosis, proliferation, migration, and invasion [64,65,66,67]. Previous findings have also indicated that MALAT1 induces epithelial-mesenchymal transition in breast, cervical, colorectal, esophageal, and prostate cancer cells as well as in non-transformed cell lines, namely ARPE-19 cells, HK-2 cells, and lens epithelial cells [68,69,70,71,72,73,74,75,76]. High-risk HPV16 was reported to regulate MALAT1 expression in cervical cancer cells, whereby MALAT1 overexpression was dependent on the expression of the HPV16 E7 oncoprotein [77].
As highlighted above, several of the DE genes in common warts were previously reported to be dysregulated in psoriatic skin. Certain HPV types are highly prevalent in psoriatic skin, and a nationwide populationbased cohort study in Taiwan indicated that there is a possible link between HPV infection and the risk of developing psoriasis [78,79,80]. The STAT3 gene, which plays a critical role in psoriasis pathogenesis, is essentially activated by high-risk HPVs in both differentiated and undifferentiated keratinocytes [46,81].
Likewise, many of the DE genes in our study previously reported to be dysregulated in colorectal cancer. Like psoriasis, HPV positivity resulted in an increased risk of colorectal carcinoma, and HPV infection has been suggested to be a co-factor in the pathogenesis of colorectal cancer [82,83]. Small population studies have shown that colorectal infection with high-risk HPV types is common in colorectal cancer patients, but large population studies have failed to replicate this finding [84,85,86,87]. In any case, the exact role that HPV plays in colorectal tumor carcinogenesis is remains to be determined.

Conclusion
Our findings indicate that HPV-induced warts impact the expression profiles of certain lncRNA genes. To the best of our knowledge, the present study is the first to explore the impact of low-risk HPV infection on lncRNA expression profiles. Nonetheless, our findings are limited by the relatively small sample size as well as the all-male patient cohort, and they must be validated in a larger cohort. Moreover, more research needs to be carried out in order to determine whether the changes in lncRNA expression were induced as part of the host's response to infection or from the HPV itself.

Author contribution statement
Amneh H. Tarkhan, Mansour A. Alghamdi: Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Laith N. AL-Eitan: Conceived and designed the experiments; Performed the experiments; Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data; Wrote the paper.
Rami Q. Alkhatib: Analyzed and interpreted the data; Contributed reagents, materials, analysis tools or data.

Funding statement
This work was supported by the Deanship of Scientific Research at King Khalid University through the Small Groups Project (grant number: RGP. 1/66/43). This work was also supported by the Deanship of Research at Jordan University of Science and Technology (177/2017).

Data availability statement
Data associated with this study has been deposited at NCBI GEO under the accession number GSE136347.

Declaration of interests statement
The authors declare no conflict of interest.

Additional information
No additional information is available for this paper.